Use of Combined Topic Models in Unsupervised Domain Adaptation for Word Sense Disambiguation
Abstract
Topic models can be used in unsupervised domain adaptation for Word Sense Disambiguation (WSD). In the domain adaptation task, three types of topic models are available: (1) a topic model constructed from the source domain corpus, (2) a topic model constructed from the target domain corpus, and (3) a topic model constructed from both domains. The basic approach adds three topic features, one derived from each topic model, to the standard features used for WSD. An SVM is then trained on the extended features to perform WSD. However, because topic features derived from the source domain can reduce WSD accuracy, they are weighted by the similarity between the source corpus and the entire corpus. In six domain adaptation transitions across three domains, we conducted experiments varying the combination of topic features, and we show the effectiveness of the proposed method.
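The feature extension described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how the similarity between the source corpus and the entire corpus is computed, so cosine similarity over word-count vectors is assumed here, and the names `cosine_similarity`, `extend_features`, and all the toy vectors are hypothetical.

```python
import numpy as np


def cosine_similarity(u, v):
    """Cosine similarity between two count vectors (0.0 if either is all zeros)."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom > 0 else 0.0


def extend_features(base, topic_src, topic_tgt, topic_both, src_weight):
    """Append the three topic-feature blocks to the base WSD features.

    Only the source-domain block is scaled by src_weight; the target and
    combined blocks are appended unchanged.
    """
    return np.concatenate([base, src_weight * topic_src, topic_tgt, topic_both])


# Hypothetical word-count vectors over a shared 4-word vocabulary.
source_counts = np.array([4.0, 0.0, 1.0, 3.0])
target_counts = np.array([0.0, 5.0, 2.0, 1.0])
all_counts = source_counts + target_counts

# ASSUMPTION: the similarity measure is cosine similarity between the
# source corpus counts and the counts of the entire (source + target) corpus.
src_weight = cosine_similarity(source_counts, all_counts)

# Hypothetical features for one training instance.
base = np.array([1.0, 0.0, 1.0])   # standard WSD features
topic_src = np.array([0.7, 0.3])   # from the source-domain topic model
topic_tgt = np.array([0.2, 0.8])   # from the target-domain topic model
topic_both = np.array([0.5, 0.5])  # from the combined topic model

x = extend_features(base, topic_src, topic_tgt, topic_both, src_weight)
print(x.shape)
```

The resulting vector `x` would then be passed to an SVM classifier. Dropping one or more topic blocks from the concatenation corresponds to varying the combination of topic features, as in the experiments the abstract mentions.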
Related papers
Research and applications: Word sense disambiguation in the clinical domain: a comparison of knowledge-rich and knowledge-poor unsupervised methods
OBJECTIVE To evaluate state-of-the-art unsupervised methods on the word sense disambiguation (WSD) task in the clinical domain. In particular, to compare graph-based approaches relying on a clinical knowledge base with bottom-up topic-modeling-based approaches. We investigate several enhancements to the topic-modeling techniques that use domain-specific knowledge sources. MATERIALS AND METHOD...
Word sense disambiguation in the clinical domain: a comparison of knowledge-rich and knowledge-poor unsupervised methods
To cite: Chasin R, Rumshisky A, Uzuner O, et al. J Am Med Inform Assoc 2014;21:842–849. ABSTRACT Objective To evaluate state-of-the-art unsupervised methods on the word sense disambiguation (WSD) task in the clinical domain. In particular, to compare graph-based approaches relying on a clinical knowledge base with bottom-up topic-modeling-based approaches. We investigate several enhancements to ...
Word Sense Disambiguation of Ambiguous Persian Words Using the LDA Topic Model
Word sense disambiguation is the task of identifying the correct sense of a word in a given context from a finite set of possible senses. In this paper, a model for Farsi word sense disambiguation is presented. The model uses two groups of features: first, all words and stop words around the target word, and second, topic models. We extract topics from a Farsi corpus with Latent Dirichlet ...
Unsupervised Domain Adaptation for Word Sense Disambiguation using Stacked Denoising Autoencoder
In this paper, we propose an unsupervised domain adaptation method for Word Sense Disambiguation (WSD) using a Stacked Denoising Autoencoder (SdA). SdA is an unsupervised learning method that uses a neural network to obtain an abstract feature set from input data. The abstract feature set absorbs the differences between domains, and thus SdA can address the problem of domain adaptation. However, SdA does not always c...
Word Sense Disambiguation in Clinical Text
Lexical ambiguity, the ambiguity arising from a string with multiple meanings, is pervasive in language of all domains. Word sense disambiguation (WSD) and word sense induction (WSI) are the tasks of resolving this ambiguity. Applications in the clinical and biomedical domain focus on the potential disambiguation has for information extraction. Most approaches to the problem are unsupervised or...